Field Of The Invention

The present invention relates generally to a method and apparatus to 
efficiently compute powers and multiples of quantities that are useful for 
cryptographic purposes such as, for example, key generation, encryption, 
decryption, authentication, identification, and digital signatures. 


Background Of The Invention 

Modern cryptographic methods require massive numbers of basic
arithmetic operations such as addition, subtraction, multiplication, division,
remainder, shift, and logical 'and', 'or', and 'xor'. Many of these methods
require computation of powers A^k (respectively multiples k*A) for a value k that
is selected at random from a large set of possible values.



Heretofore a variety of methods have been proposed and implemented for
computing powers A^k (respectively multiples k*A) for a specified value of k (H.
Cohen, A Course in Computational Algebraic Number Theory, GTM 138,
Springer-Verlag, 1993 (Section 1.2)), (D. Gordon, A survey of fast
exponentiation methods, Journal of Algorithms 27 (1998), 129-146), (D. Knuth,
The Art of Computer Programming, Volume 2, Seminumerical Algorithms, 3rd ed.,
Addison-Wesley, 1998 (Section 4.6.3)), (A.J. Menezes, et al., Handbook of
Applied Cryptography, CRC Press, 1997 (Section 14.6)).

One type of rapid computational method which has been used is to write 
k as a sum of powers of two to reduce the computation to multiplications and 
squarings (respectively additions and doublings). 
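As an illustration of this first method, the following Python sketch performs
left-to-right binary exponentiation modulo an integer and counts the squarings
and multiplications it uses. The function name binary_power, the modulus, and
the exponent in the usage note are illustrative assumptions and are not taken
from the cited references.

```python
def binary_power(a, k, modulus):
    """Return (a**k % modulus, squarings, multiplications).

    Left-to-right square-and-multiply over the binary expansion of k.
    The leading 1 bit is consumed by the initial assignment, so the number
    of multiplications is one less than the number of ones in k.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    bits = bin(k)[2:]
    result = a % modulus
    squarings = 0
    multiplications = 0
    for bit in bits[1:]:
        result = (result * result) % modulus
        squarings += 1
        if bit == "1":
            result = (result * a) % modulus
            multiplications += 1
    return result, squarings, multiplications
```

For the exponent k = 2^10 + 2^3 + 1, which has three ones in its binary
expansion, the sketch uses 10 squarings and 2 multiplications.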



A second type of rapid computational method which has been used is to
write k as a sum of plus and minus powers of two to reduce the computation to
multiplications, inversions, and squarings (respectively additions, subtractions,
and doublings).


A third type of rapid computational method which has been used is to 
write k as a sum of small multiples of powers of two to reduce the computation 
to multiplications, small powers, and squarings (respectively additions, 
subtractions, small multiples, and doublings). 


A fourth type of rapid computation method which has been used is to
write k as a sum of powers of a special multiplier t, where t has the property
that raising to the t power (respectively multiplying by t) takes very little time
compared to multiplication (respectively addition or subtraction). A particular
instance of this method is writing k as a sum of powers of the p-power
Frobenius map on the group of points E(GF(p^m)) on an elliptic curve E defined
over the finite field GF(p).

A fifth type of rapid computation method which has been used (called the
factor method (Knuth, supra, Section 4.6.3, page 463 and exercise 3)) is to write
a given integer k as a product of factors which are themselves written using one
of the previously described methods.

All of these methods allow reasonably rapid computation of powers A^k (or
multiples k*A) for all or most values of k, but many users would find it
desirable to have a method which allows even more rapid computation of
powers A^k (or multiples k*A) with k taken from a sufficiently large set of
allowable values.

Summary Of The Invention

In many cryptographic operations one uses a random power or multiple
of an element in a group or a ring. The present invention provides a method,
system and apparatus for transforming digital information that uses a fast
method to compute powers and multiples in certain important situations
including, for example, powers in the Galois field F_{2^n}, multiples on Koblitz
elliptic curves, multiples in NTRU convolution polynomial rings, and the like.
In accord with the invention, for example, an exponent or multiplier is
expressed as a product of factors, each of which has low Hamming weight when
expanded as a sum of powers of some fast operation. This is useful in many
cryptographic operations including, for example, key generation, encryption,
decryption, creation of a digital signature, verification of a digital signature,
creation of a digital certificate, authentication of a digital certificate,
identification, pseudorandom number generation and computation of a hash
function, and the like, particularly in operations where a random exponent or
multiplier is utilized.



Thus, the present invention provides a method, system and apparatus
for performing a cryptographic operation that comprises transforming digital
information by operating with a digital operator having a component selected
from a large set of elements, wherein the component has a low Hamming
weight. Typically, components are selected to have a Hamming weight less than
about 30, preferably less than about 20, more preferably less than about 15,
and most preferably less than about 10. Preferably, the digital operator
comprises a component having a plurality of factors, each factor having a low
Hamming weight. By using a digital operator having one or more components
in accord with the present invention, the transforming step can be performed at
increased speed compared to the transforming step using prior art elements. A
component in accord with the present invention is formed as a product of
factors, each of which has low Hamming weight when expanded as a sum of
powers of some fast operation (as performed by a suitable computing device
such as, for example, a CPU, microprocessor or computer, or the like).



In accord with the present invention, powers A^k (or multiples k*A) can be
computed very rapidly with k chosen from a large set of possible elements.
Computation in accord with methods of the present invention typically is
considerably faster than the most widely used methods presently in use.

In one embodiment of the invention, the computational technique for an
exponential component (or for a component having multiplied elements) uses
products of specially selected elements to speed the exponential calculations (or
multiplication calculations) of the method. The computational complexity of
computing a power A^(k(1)k(2)...k(r)) (or a multiple k(1)k(2)...k(r)*A) is
proportional to the sum of the computational complexity of computing the powers
A_1^k(1), ... , A_r^k(r) (or the multiples k(1)*A_1, ... , k(r)*A_r), although the
number of possible powers (or multiples) is approximately equal to the product
of the number of allowed values for k(1), ... , k(r). The fact that the
computational complexity increases proportionally to the sum of the individual
computational complexities although the size of the set of possible values of
A^k (respectively k*A) increases proportionally to the product of the number of
allowed values for k(1), ... , k(r) helps to explain why the invention provides
an improved computational method. The method is especially effective in
situations wherein there is an element t for which powers A^t (or multiples t*A)
can be computed rapidly. Preferably, k(1), ... , k(r) are chosen to be low
Hamming weight polynomials in the element t.



The present invention also provides a computer readable medium
containing instructions for performing the above-described methods of the
invention.



DEFINITIONS 

The following definitions are used for purposes of describing the present
invention. A computer readable medium shall be understood to mean any
article of manufacture that contains data that can be read by a computer or a
carrier wave signal carrying data that can be read by a computer. Such
computer readable media include but are not limited to magnetic media, such
as a floppy disk, a flexible disk, a hard disk, reel-to-reel tape, cartridge tape,
cassette tape or cards; optical media such as CD-ROM and writeable compact
disc; magneto-optical media in disc, tape or card form; paper media, such as
punched cards and paper tape; or a carrier wave signal received through a
network, wireless network or modem, including radio-frequency signals and
infrared signals.

The term "large set" as used herein shall mean a set of elements that is
large enough to prevent someone from checking, within a predetermined time or
in less than a predetermined minimum number of operations, the elements of
the set to discover a randomly chosen element in the set that is used in a
cryptographic operation. The longer the time or larger the minimum number of
operations required to discover the chosen element, the more secure is the
cryptographic operation.

Detailed Description Of The Invention

There are many cryptographic methods that require a random power A^k
or random multiple k*A, where k is an element of a ring R and A is an element
of an R-module M. Exemplary methods requiring such random powers or
multiples include Diffie-Hellman key exchange (U.S. Patent No. 4,200,770),
(Menezes, supra, section 12.6.1), ElGamal public key cryptography (Menezes,
supra, section 8.4), the digital signature standard (DSS) (U.S. Patent No.
5,231,668), (Menezes, supra, section 11.5.1), the NTRU® public key
cryptosystem (U.S. Patent No. 6,081,597), (J. Hoffstein, et al., NTRU: A new high
speed public key cryptosystem, in Algorithmic Number Theory (ANTS III),
Portland, OR, June 1998, Lecture Notes in Computer Science 1423 (J.P.
Buhler, ed.), Springer-Verlag, Berlin, 1998, 267-288), and the NTRU® signature
scheme (NSS) (J. Hoffstein, et al., NSS: The NTRU Signature Scheme, Proc.
EUROCRYPT 2001, Lecture Notes in Computer Science, Springer-Verlag, 2001).

One variant of Diffie-Hellman, ElGamal, and DSS uses the ring R = Z of
integers and the R-module M = GF(p^m)* of nonzero elements of a finite field. A
second variant of Diffie-Hellman, ElGamal, and DSS uses the endomorphism
ring R = End(A) of an abelian variety A and the R-module M = A(GF(p^m)) of
points on A defined over a finite field. An instance of this variant is an elliptic
curve A = E defined over the finite field GF(p) and an endomorphism ring R that
includes Z and the p-power Frobenius map on E(GF(p^m)). The NTRU public key
cryptosystem and NTRU signature scheme use a ring R = B[X]/I of polynomials
with coefficients in a ring B modulo an ideal I and the R-module M = R. An
instance of NTRU uses the convolution ring R = (Z/qZ)[X]/(X^N - 1).

The method of the invention involves choosing the quantity k from a set
of elements of the ring R of the form

k = k(1)*k(2)* ... *k(r)

wherein computation of the powers A^k(i) (or multiples k(i)*A) is computationally
fast for every element A of the R-module M, and computing the power A^k (or
multiple k*A) as the sequence of steps

A_1 = A^k(1), A_2 = A_1^k(2), ... , A^k = A_r = A_(r-1)^k(r),

(respectively as the sequence of steps

A_1 = k(1)*A, A_2 = k(2)*A_1, ... , A_r = k(r)*A_(r-1).)
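The sequence of steps can be sketched in Python for the simplest case, the ring
R = Z acting by exponentiation on the integers modulo a fixed modulus. The
function name chained_power and the sample factors and modulus are
illustrative assumptions; the sketch only demonstrates that applying the
factors one after another yields the same value as raising to their product.

```python
def chained_power(a, factors, modulus):
    """Compute a**(k(1)*k(2)*...*k(r)) % modulus one factor at a time.

    Each step raises the previous result to the next factor, mirroring
    A_1 = A^k(1), A_2 = A_1^k(2), ..., A_r = A_(r-1)^k(r).
    """
    current = a % modulus
    for k_i in factors:
        current = pow(current, k_i, modulus)
    return current


# Hypothetical low Hamming weight factors (sums of a few powers of two).
factors = [2**5 + 1, 2**7 + 2**2, 2**9 + 2]
modulus = 1000003
value = chained_power(5, factors, modulus)
```

The result agrees with pow(5, k, modulus) for k equal to the product of the
three factors, while each individual step involves only a factor with two
nonzero binary digits.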


In one embodiment of the invention, a ring R contains an element t so
that computation of the power A^t (respectively multiple t*A) is computationally
fast for every element A of the R-module M. Examples of this include:

(1) the multiplicative group of a finite field M = GF(p^m)* and the
element t = p corresponding to raising to the p-th power, t(x) = x^p;

(2) an elliptic curve E defined over a finite field GF(p), the group of
points M = E(GF(p^m)) of E with coordinates in the extension field
GF(p^m), and the element t that is the p-th power Frobenius element
defined by t(x,y) = (x^p, y^p);

(3) the ring of convolution polynomials R = M = (Z/qZ)[X]/(X^N - 1) and
the element t = X corresponding to multiplication by X in the ring
R, t(f(X)) = X*f(X).

In the instance that the ring R contains such an element t, the elements k(i)
preferably are chosen to be polynomials in t,

a_0 + a_1*t + a_2*t^2 + ... + a_n*t^n,

wherein the coefficients a_0, ... , a_n are chosen from a restricted set. Exemplary
choices for a_0, ... , a_n are the sets {0,1} and {-1,0,1}. The latter is useful
primarily when inversion (or negation) is a computationally fast operation in M.

The effectiveness of the invention can be measured by the Hamming
weight HW of a polynomial:

HW(a_0 + a_1*t + ... + a_n*t^n) = # of a_i that are nonzero.

Because computation of A^t (or t*A) takes negligible time, the time to compute
A^k(i) (or k(i)*A) for k(i) = a_0 + a_1*t + ... + a_n*t^n is approximately

TimeToCompute(A^k(i)) ≈ HW(k(i)) multiplications,

respectively,

TimeToCompute(k(i)*A) ≈ HW(k(i)) additions.

The method of choosing k in the form k = k(1)*k(2)* ... *k(r) thus allows
computation of A^k (or k*A) in approximately

TimeToCompute(A^k) ≈ HW(k(1)) + ... + HW(k(r)) + r - 1 multiplications,

respectively,

TimeToCompute(k*A) ≈ HW(k(1)) + ... + HW(k(r)) + r - 1 additions.

Thus, the computational effort to compute A^k (respectively k*A) is approximately
proportional to the sum of the Hamming weights of the quantities k(1), ... , k(r).
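To make the cost estimate concrete for the convolution ring
(Z/qZ)[X]/(X^N - 1) with t = X, the following Python sketch multiplies a
polynomial by a product of two sparse factors, using one cyclic shift and one
polynomial addition per nonzero term of each factor. The helper names (shift,
sparse_mul, dense_mul, to_dense), the parameters N = 11 and q = 32, and the
sample factors are all illustrative assumptions chosen to keep the example
small.

```python
def shift(poly, e):
    """Multiply by X**e: cyclically rotate the coefficient list by e places."""
    e %= len(poly)
    return poly[-e:] + poly[:-e] if e else list(poly)


def sparse_mul(sparse, poly, q):
    """Multiply poly by a sparse factor given as (exponent, coefficient) pairs.

    Uses one shifted addition per nonzero term, i.e. HW(factor) of them.
    """
    out = [0] * len(poly)
    for exponent, coeff in sparse:
        rotated = shift(poly, exponent)
        out = [(o + coeff * r) % q for o, r in zip(out, rotated)]
    return out


def dense_mul(f, g, q):
    """Schoolbook cyclic convolution, used here only as a reference."""
    n = len(f)
    out = [0] * n
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[(i + j) % n] = (out[(i + j) % n] + fi * gj) % q
    return out


def to_dense(sparse, n, q):
    """Expand (exponent, coefficient) pairs into a length-n coefficient list."""
    out = [0] * n
    for exponent, coeff in sparse:
        out[exponent % n] = (out[exponent % n] + coeff) % q
    return out


N, Q = 11, 32
h = list(range(1, N + 1))            # an arbitrary element of the ring
k1 = [(0, 1), (3, 1)]                # 1 + X^3, Hamming weight 2
k2 = [(1, 1), (4, -1), (7, 1)]       # X - X^4 + X^7, Hamming weight 3
chained = sparse_mul(k2, sparse_mul(k1, h, Q), Q)
k_dense = dense_mul(to_dense(k1, N, Q), to_dense(k2, N, Q), Q)
reference = dense_mul(k_dense, h, Q)
```

Here chained is obtained with 2 + 3 shifted additions, whereas the expanded
product k1*k2 generally has up to 6 nonzero terms; both routes give the same
ring element.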


If the polynomial coefficients a_0, ... , a_n are chosen from the exemplary
set {0,1} and if the quantities k(1), ... , k(r) are chosen to have Hamming weights
d_1, ... , d_r, respectively, then the number of r-tuples (k(1), ... , k(r)) is

C(n+1,d_1)*C(n+1,d_2)* ... *C(n+1,d_r),

where C(n,d) = n!/(d!*(n-d)!) is the combinatorial symbol. Thus, the number of
r-tuples (k(1), ... , k(r)) is the product of the number of individual values for
each k(i). Further, in many exemplary situations, experiments show that, if the
product C(n+1,d_1)* ... *C(n+1,d_r) is chosen to be smaller than the number of
elements in the ring R, then most of the products k(1)* ... *k(r) will be distinct.
Hence, random powers A^k (or multiples k*A) can be efficiently computed for all
k in a large subset of R. The specific size of the subset may be adjusted by
suitable choices of parameters, such as the parameters r and d_1, ... , d_r.
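The count of r-tuples can be evaluated directly. The following Python sketch
computes the product of binomial coefficients for a given degree bound n and
list of Hamming weights; the function name tuple_count and the small sample
parameters are illustrative assumptions.

```python
from math import comb


def tuple_count(n, weights):
    """Number of r-tuples (k(1), ..., k(r)) with coefficients in {0, 1}.

    Each k(i) is a polynomial of degree at most n, so it has n + 1
    coefficient positions, of which weights[i] are chosen to be nonzero.
    """
    total = 1
    for d in weights:
        total *= comb(n + 1, d)
    return total


small_example = tuple_count(4, [1, 2])   # C(5,1) * C(5,2)
```

Because the count is a product while the cost is a sum of the weights, adding
one more low-weight factor multiplies the number of tuples but adds only a few
operations.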



A generalization of the embodiment described is a ring R that contains
several elements t_1, ... , t_z so that computation of the power A^t (or multiple
t*A) is computationally fast for each t = t_1, ... , t_z and for every element A
of the R-module M, and in which the elements k(i) are chosen to be polynomials
in t_1, ... , t_z. Another generalization of the embodiment described is
selection of elements

k = k_1(1)* ... *k_1(r_1) + k_2(1)* ... *k_2(r_2) + ... + k_w(1)* ... *k_w(r_w)

that are sums of products of elements k_j(i) of the sort k(i) previously described.
Further generalizations will be readily apparent to those skilled in the art.

Additional details of the speed enhanced cryptographic techniques in
accordance with the present invention are described in the examples below.

There are many cryptographic constructions in which one uses a
random power or multiple of an element of a group or ring. A brief and far from
complete list includes:

1. Diffie-Hellman Key Exchange
One takes an element g in a finite field F and computes a random
power g^k in F. Here k is an integer.

2. Elliptic Curve DH Key Exchange
One takes a point P in the group E(F) of points on an elliptic curve
over a finite field and computes a random multiple kP. Here k may be an
integer or a more general endomorphism of the group E(F).

3. DSS and ECDSS
The Digital Signature Standard (using a finite field or an elliptic
curve) requires a random power g^k or multiple kP in the signing portion
of the algorithm. The verification process also requires a power or
multiple, but for specified values of k, not random values.

4. Classical ElGamal Public Key Cryptosystem
ElGamal key generation requires computation of a power β = α^j
with a fixed base α and a randomly chosen exponent j that forms the
secret key. Encryption requires computation of two powers α^k and β^k to a
randomly chosen exponent k. Decryption requires computation of a
power γ^j.

5. Elliptic Curve ElGamal and Variants
Key generation requires computation of a multiple Q = jP with a
fixed point P in E(F) and a randomly chosen multiplier j that forms the
secret key. Encryption requires computation of two multiples kP and kQ
to a randomly chosen multiplier k. Decryption requires computation of a
multiple jR. Again k may be an integer or a more general endomorphism
of the group E(F).

6. NTRU Public Key Cryptosystem
The private key includes a random polynomial f(X) in the ring
R_q = (Z/qZ)[X]/(X^N - 1) of truncated polynomials modulo q. Encryption
requires computation of a product r(X)h(X) in the ring R_q, where h(X) (the
public key) is fixed and r(X) is random. Decryption requires computation
of a product f(X)e(X) in the ring R_q, where e(X) is the ciphertext.



In accord with one embodiment of the invention, a general method is
described that in many situations allows random multiples to be computed
more rapidly than previously described methods. Although not universally
applicable, it can be used for many of the algorithms in the above list, including
Diffie-Hellman over Galois fields F_{2^n}, elliptic curve cryptography over Koblitz
curves, and the NTRU cryptosystem. In accord with the invention, a random
exponent or multiplier is formed as a product of factors, each of which has low
Hamming weight when expanded as a sum of powers of some fast operation.

Briefly, in accord with the present invention, the random multiplier is
written as a product of terms, each of which is a sum of terms that are
relatively easily computed. These multipliers are referred to as Small Hamming
Weight Products (SHWP), because each term in the product has low Hamming
weight relative to an easily computed operation.


Low Hamming Weight Exponents 

The use of low Hamming weight exponents has been studied in both RSA
exponentiation (C.H. Lim, et al., Sparse RSA keys and their generation, preprint,
2000) and in discrete logarithm algorithms (D. Coppersmith, et al., On the
Minimum Distance of Some Quadratic Residue Codes, IEEE Transactions on
Information Theory, vol. IT-30, No. 2, March 1984, 407-411; D.R. Stinson,
Some baby-step giant-step algorithms for the low Hamming weight discrete
logarithm problem, Mathematics of Computation, to be published), but always in
the context of taking a single exponent k of small Hamming weight. The
present invention uses a product k = k_1 k_2 ... k_r of very low Hamming weight
exponents and takes advantage of the fact that the sample space of the product
k is more-or-less the product of the sample spaces for k_1, ... , k_r, while the
computational complexity (in certain situations) of computing α^k is the sum of
the computational complexities of computing the powers α^(k_i).


The usual binary method to compute x^k requires approximately log2(k)
squarings and HW(k) multiplications, where

HW(k) = Hamming weight of k

is the number of ones in the binary expansion of k. The use of addition chains
for k will often yield an improvement although, for very large values of k, it is
difficult to find optimal chains.

An idea to compute random powers by precomputing a list of powers,
taking a product of a random subset, and gradually supplementing the list
using intermediate calculations was described by C.P. Schnorr, Efficient
identification and signatures for smart cards, in Advances in Cryptology (Crypto
89), Santa Barbara, CA, August 1989, Lecture Notes in Computer Science 435
(G. Brassard, ed.), Springer-Verlag, Berlin, 1989, 239-252. Schnorr's method
was broken by de Rooij at the parameter levels suggested in Schnorr (P. de
Rooij, On the security of the Schnorr scheme using preprocessing, in Advances in
Cryptology (Eurocrypt 90), Aarhus, Denmark, May 1990, Lecture Notes in
Computer Science 473 (I.B. Damgård, ed.), Springer-Verlag, Berlin, 1990,
71-80).

Another method, the factor method, is briefly discussed by Knuth, supra 
(at 4.6.3, page 463 and exercise 3). 

The present invention provides an improvement over those prior art
methods for many applications. In one embodiment, for example, k is a
product k = uv and, in accord with the present invention, z = x^k is computed as
y = x^u and z = y^v. This process can be repeated and interspersed with the
binary method or the use of other addition chains.

To illustrate another embodiment of the present invention, let G be a group
in which the quantity x^k is to be computed. Suppose that we write the
exponent k as a sum of products

k = \sum_i k_i = \sum_i \prod_n K_{i,n}.    (1)

We compute x^k as the product \prod_i x^{k_i}, we compute each power x^{k_i}
using the factor method with k_i = \prod_n K_{i,n}, and we compute each power
y^{K(i,n)} (K(i,n) = K_{i,n}) by using (say) the binary method. This requires
approximately log2(k) squarings and approximately

\sum_i \sum_n HW(K_{i,n}) multiplications.    (2)

For small values of k, one might ask for the decomposition Eq. (1) that
minimizes Eq. (2). For larger values of k, one might ask for an algorithm that
produces a reasonably small value of Eq. (2). However, that is not the focus of
the present invention.


Both the goals and the analysis for the method of the present invention
differ significantly from the exponentiation as described in Knuth, supra. The
goal in Knuth is to describe efficient methods for computing x^k for a given
exponent k. The subsequent analysis gives theoretical upper and lower bounds
for the most efficient method and algorithms for taking a given k and finding a
reasonably efficient way to evaluate x^k. The present goal is to find a collection
of exponents k such that x^k is easy to compute and such that the collection is
sufficiently "random" and sufficiently large. This seemingly minor change in
perspective from specific exponents to random exponents actually represents a
major shift in the underlying questions and in the methods that are used to
study them.

There is a second important way in which the present invention differs
from the factor method as described in Knuth. The present invention is
directed to situations in which there is a "free" operation. By way of example,
let G be a group and suppose that it is desired to compute x^k using the factor
method, where k = uv. The cost of computing x^k is approximately

(log2(k) squarings) + (HW(u) + HW(v) multiplications)

where we assume for simplicity that the two powers y = x^u and z = y^v are
computed using the binary method. Now suppose that the (finite) group G has
order N and suppose that k is written as a product modulo N, say k = uv (mod
N). Then y = x^u and z = y^v will still give us the correct value z = x^k, but now
the cost is approximately

(log2(uv) squarings) + (HW(u) + HW(v) multiplications)

If squaring and multiplication take approximately the same amount of
time, then this method will probably be very bad because the product uv will be
very large.



On the other hand, if squaring is very fast, as it is for example in the
Galois field F_{2^n}, then large values of u and v can be advantageous as long as u
and v have small (binary) Hamming weight. This will be illustrated, for
example, in three situations of cryptographic interest, namely, exponentiation
in Galois fields F_{2^n}, multiplication on Koblitz elliptic curves, and
multiplication in NTRU convolution rings F_q[X]/(X^N - 1). These specific
situations are described in detail below. Also discussed below are some of the
issues surrounding the randomness of small Hamming weight products. Those
skilled in the art will realize and be able to utilize this invention in many other
applications.


Random Powers in Galois Fields F_{2^n}

In any group, the standard way to compute a power α^k is to use the
binary expansion of k. This reduces the computation of α^k to approximately
log2(k) squarings and HW(k) multiplications, where on average HW(k) equals
approximately ½ log2(k). (Using a signed binary expansion of k further reduces
the number of multiplications, at the expense of an inversion.)



Binary powering algorithms apply to any group, but the feature exploited
by the present invention in F_{2^n} is the fact that squaring is essentially free
compared to multiplication. Thus, if k is randomly chosen in the interval from
1 to 2^n - 1, then computation of α^k is dominated by the approximately n/2
multiplications that are required.



As indicated above, there are many cryptographic situations in which a
person needs to compute α^k for a fixed base α and some randomly chosen
exponent k. Generally, a requirement is that k be chosen from a sufficiently
large set that an exhaustive search (or more generally, a square root search
such as Pollard's rho method) will be unable to determine k. Thus, suppose
that one chooses k to have the form

k = \sum_{i=0}^{n-1} k_i 2^i with k_i ∈ {0, 1}    (3)

with a fixed binary Hamming weight d = \sum k_i. Then, the size of the search
space of k is C(n,d). One typically wants the search space to have at least 2^160
elements, because the running time of a search will typically be proportional to
the square root of the size of this space. See Stinson, supra, for a description of
Coppersmith's baby-step giant-step algorithm to efficiently search this space in
time proportional to √(t C(n/2, d/2)).

For cryptographic purposes, a typical value for n is n ≈ 1000, which is
dictated by the running time of sieve and index calculus methods for solving the
discrete logarithm problem over F_{2^n}. Then, taking d = 25 gives a search space
of size C(1000, 25) ≈ 2^165, and computation of α^k requires 24 multiplications.

The method in accord with the present invention is to choose k to be a
product of terms with very low binary Hamming weight. (More generally, one
can use a sum of such products.) To illustrate with the above value n ≈ 1000,
let k have the form

k = k(1)k(2)k(3),

where k(1) has binary Hamming weight of 6, and k(2) and k(3) each have binary
Hamming weight of 7. Then, the search space for k, which is the product of the
search spaces for the three factors, has order (C(1000, 6) C(1000, 7) C(1000, 7))
≈ 2^165, while computation of

α^k = ((α^k(1))^k(2))^k(3)

requires only 5 + 6 + 6 = 17 multiplications. This represents a savings of
approximately 29%.
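The figures in this example can be checked numerically. The following Python
sketch evaluates the two search-space sizes and the two multiplication counts
with the standard library; the variable names are illustrative assumptions, and
the multiplication counts follow the convention used above, in which a factor of
Hamming weight d costs d - 1 multiplications.

```python
from math import comb, log2

n = 1000

# Single exponent of binary Hamming weight 25.
single_space_log2 = log2(comb(n, 25))
single_mults = 25 - 1

# Product of three factors of Hamming weights 6, 7 and 7.
weights = (6, 7, 7)
product_space = 1
for d in weights:
    product_space *= comb(n, d)
product_space_log2 = log2(product_space)
product_mults = sum(d - 1 for d in weights)

saving = 1 - product_mults / single_mults
print(f"single:  2^{single_space_log2:.1f}, {single_mults} multiplications")
print(f"product: 2^{product_space_log2:.1f}, {product_mults} multiplications")
print(f"saving:  {saving:.0%}")
```

Both search spaces come out at roughly 2^165, while the multiplication count
drops from 24 to 17.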

Preferably, a search space at least of order approximately 2^160 is required
because the standard square root search attacks reduce the time to O(2^80)
(Stinson, supra). However, if k is a product of several low Hamming weight
polynomials, it is not clear how one would set up a square root attack on the
full space. Thus, if k = k(1)k(2)k(3), one can search (guess) the first two terms
and then use a square root attack for the third term. A second approach to
solving α^k = β for k is to transfer k(3) to the other side. Thus, let i run through
the space of all products k(1)k(2) and let j run through the space of all k(3)
values and make tables of the values of α^i and β^(j^-1), where j^-1 is the inverse
of j modulo 2^n - 1. Then, the running time is proportional to the sum of the
sizes of the two tables.

In the example given above, this yields a running time proportional
to (C(1000, 6) C(1000, 7)) + C(1000, 7) ≈ 2^107.7. However, in view of this search
method, it is preferred to make k(3) considerably larger than k(1) and k(2). Thus,
if we select k(1), k(2) and k(3) to have Hamming weights 2, 2, 11, respectively,
then the first square root attack has time O(2^80.0) and the second square root
attack has time O(2^84.3), while computation of α^k requires only 12
multiplications.
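The two attack estimates can be reproduced with a short Python sketch. The
function name attack_costs_log2 is an illustrative assumption, and the cost
model is the simplified one described above: the first attack guesses the first
two factors and applies a square root search to the third, and the second
attack costs the sum of the sizes of the two tables.

```python
from math import comb, log2


def attack_costs_log2(n, d1, d2, d3):
    """Return (guess_then_sqrt, two_tables) as base-2 logarithms.

    s1, s2, s3 are the sizes of the spaces for k(1), k(2), k(3).
    guess_then_sqrt: s1 * s2 * sqrt(s3) steps.
    two_tables:      s1 * s2 + s3 table entries.
    """
    s1, s2, s3 = (comb(n, d) for d in (d1, d2, d3))
    guess_then_sqrt = log2(s1) + log2(s2) + log2(s3) / 2
    two_tables = log2(s1 * s2 + s3)
    return guess_then_sqrt, two_tables


first_attack, second_attack = attack_costs_log2(1000, 2, 2, 11)
_, tables_677 = attack_costs_log2(1000, 6, 7, 7)
```

With weights (2, 2, 11) the sketch gives about 2^80.0 and 2^84.3, and with
weights (6, 7, 7) the table-based search is about 2^107.7, in agreement with
the values stated in the text.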

The above discussion can be applied similarly to fields with p^n elements
using multipliers of the form ±p^{e_1} ± ... ± p^{e_d}.

Random Multiples on Koblitz Elliptic Curves 

Let E/F_{2^m} be an elliptic curve defined over the field with 2^m elements, and
let P ∈ E(F_{2^m}) be a point on the curve. A number of cryptographic constructions
require the computation of a multiple NP, where N has size comparable to 2^m.
Writing N in binary form as

N = N_0 + 2N_1 + 4N_2 + ... + 2^i N_i + ... + 2^m N_m with N_0, ... , N_m ∈ {0, 1},

the computation of NP is reduced to approximately m doublings and m/2
point additions. As already indicated, further savings may be obtained by
choosing N_0, ... , N_m in the set {-1, 0, 1}, reducing the number of additions to
approximately m/3. Unfortunately, on elliptic curves, doubling a point is
computationally more difficult than adding two different points.

For certain elliptic curves, it is possible to significantly reduce the
necessary computation by replacing doubling with a Frobenius map that is
essentially free. Let E/F_2 be a "Koblitz curve", that is, an ordinary elliptic curve
defined over the field with two elements. Thus, E is one of the two curves

E : y^2 + xy = x^3 + ax^2 + 1 with a ∈ F_2.

Let

τ : E(F_{2^m}) → E(F_{2^m}), τ(x, y) = (x^2, y^2)

be the Frobenius map on E. The computation of τ(Q) takes very little time
compared to point addition or doubling on E. It is possible to write any integer
N as a linear combination

N = N_0 + τN_1 + τ^2 N_2 + ... + τ^i N_i + ... + τ^m N_m with N_0, ... , N_m ∈ {-1, 0, 1}

and, then, the computation of NP is essentially reduced to m/3 additions in
E(F_{2^m}). (Approximately m/3 of the N_i's will be nonzero.) Further, for many
cryptographic applications there is no real reason to use integer multiples of P;
one can simply use multiples NP where N is a random linear combination of
powers of τ, as above. For example, Diffie-Hellman key exchange works
perfectly well. See D. Hankerson, et al., "Software implementation of elliptic
curve cryptography over binary fields", in Cryptographic Hardware and
Embedded Systems (CHES 2000), Ç.K. Koç and C. Paar (eds.), Lecture Notes in
Computer Science, Springer-Verlag (to be published); J. Solinas, Efficient
arithmetic on Koblitz curves, Designs, Codes, and Cryptography 19 (2000), 195-
249, for basic material and computational methods on Koblitz curves.

To summarize, computation of a random signed τ-multiple of a point on a
Koblitz curve over F_{2^m} requires approximately m/3 elliptic curve additions. The
present invention provides a way to significantly reduce the number of elliptic
curve additions. As discussed above, in accord with the present invention,
choose the multiplier N to be a product of low Hamming weight linear
combinations of powers of τ.






For concreteness, a particular field of cryptographic interest is
illustrated. Let m = 163, so one is working in the field F_{2^163}. Choose N to have
the form

N = N(1)N(2)N(3) = (1 + \sum_{u=1}^{6} ±τ^{i_u}) (1 + \sum_{u=1}^{6} ±τ^{j_u}) (1 + \sum_{u=1}^{6} ±τ^{k_u}).    (4)

(We take each factor in the form (4), because one can always pull off a power of
τ from each factor. Using this form prevents overcounting.)


First, given Q = NP, check the degree of difficulty to perform a search for
N or for some other integer N' satisfying N'P = Q. A square root search (e.g.,
Pollard rho) for N' takes on the order of √(2^163) steps. A second search, which
takes advantage of the special form of N, is to write the equation Q = NP as

(N(3))^-1 Q = N(1)N(2)P

and compare tables of values of the two sides. The time and space requirement
for this search is the length of the longer of the two tables. For this example,
each of the N(i)'s is taken from a space of size 2^6 C(162, 6) ≈ 2^40.4, so the table
of values of N(1)N(2)P has O(2^80) elements. Finally, one could try guessing the
values of N(1) and N(2) and perform a square root search for N(3), but this gives
an even larger search space.

The advantage of taking N in the above form is clear. Computation of the
multiple

$$NP = N^{(1)} N^{(2)} N^{(3)} P$$

requires only 6 + 6 + 6 = 18 elliptic curve additions. (Subtractions are
essentially the same as additions.) It also requires many applications of powers
of the Frobenius map τ, but these take very little time compared to point
additions, so they may be neglected in this rough analysis. Thus, it can be seen
that, with the method of the present invention, a useful cryptographic multiple
NP can be computed using 18 additions, rather than the approximately
163/3 ≈ 54 additions required by the earlier method. Thus, the present
invention yields a 3-fold speed increase.

A meet-in-the-middle attack on all of N is not likely, but even if such an
attack exists, it suffices to replace the Hamming weights (6, 6, 6) above with the
weights (8, 9, 9) to get a set of triples $(N^{(1)}, N^{(2)}, N^{(3)})$ of order
$2^{163.9}$. The computation of NP now requires 26 additions, yielding a speed
increase by a factor of approximately 2.1. Actually, in this situation it is even
faster to use a product of four terms $N = N^{(1)} N^{(2)} N^{(3)} N^{(4)}$ with
weights (4, 5, 7, 8). Then, the total search space has size

$$2^4\, C(162, 4) \cdot 2^5\, C(162, 5) \cdot 2^7\, C(162, 7) \cdot 2^8\, C(162, 8),$$

and the computation of NP requires only 24 additions for a speed increase by a
factor of approximately 2.26.
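
The search-space and addition counts quoted for the weight choices above can be
checked with a few lines of arithmetic. The following is only a sketch: the
function names are illustrative, and it assumes that each factor consists of the
leading 1 plus `weight` signed powers of τ whose exponents are drawn from 162
candidate positions, as in the $2^6\, C(162, 6)$ count. A different counting
convention (for example `positions=163`) shifts the totals slightly.

```python
from math import comb, log2

def log2_factor_space(weight: int, positions: int = 162) -> float:
    """log2 of the number of factors 1 + (sum of `weight` signed powers of tau).

    Assumes distinct exponents chosen from `positions` candidates, each with an
    independent sign: 2^weight * C(positions, weight).
    """
    return weight + log2(comb(positions, weight))

def summarize(weights, m: int = 163, positions: int = 162):
    """Return (log2 of the product search space, additions, speedup vs. m/3)."""
    total_log2 = sum(log2_factor_space(w, positions) for w in weights)
    additions = sum(weights)        # one point addition per signed term
    speedup = (m / 3) / additions   # baseline: about m/3 additions
    return total_log2, additions, speedup

if __name__ == "__main__":
    for weights in [(6, 6, 6), (8, 9, 9), (4, 5, 7, 8)]:
        bits, adds, gain = summarize(weights)
        print(f"{weights}: ~2^{bits:.1f} multipliers, {adds} additions, ~{gain:.2f}x")
```

The speedup column only compares addition counts; applications of the Frobenius
map are neglected, as in the rough analysis above.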

Alternatively, one can take N to be a sum of products of small Hamming
weight terms. For example, $N = N^{(1)} N^{(2)} + N^{(3)} N^{(4)}$ with the four
terms having small Hamming weight. Of course, this allows a square root attack
on the two halves of N by matching values of aP with values of Q - bP.

Alternatively, N can be an actual integer, rather than a polynomial in τ.
Then, one can include conjugate terms. For example, an expression of the form
$\tau^i + \tau^{m-i}$ represents an integer, and it is a simple matter to compute
and store a table of values of $\tau^i + \tau^{m-i}$ for $1 \le i < m/2$.

The NTRU Public Key Cryptosystem
The NTRU public key cryptosystem uses truncated polynomials in the ring
$R = (Z/qZ)[X]/(X^N - 1)$. The encryption process includes computation of a
product r(X)h(X) for a fixed public key polynomial h(X) and a randomly chosen
polynomial r(X) having small coefficients. The decryption process similarly
includes computation of a product f(X)e(X), where e(X) is the ciphertext and the
private key f(X) is a polynomial with small coefficients. For further details, see
Hoffstein (1998), supra.



In general, a computation a(X)b(X) in the ring R is a convolution product
of the vectors of coefficients of a and b. The naive algorithm to compute this
convolution takes $N^2$ steps, where each step is an addition and a
multiplication. (If a(X) has coefficients that are randomly distributed in
{-1, 0, 1}, then the computation takes about $2N^2/3$ steps, where now a step is
simply an addition or a subtraction.) Other methods such as Karatsuba
multiplication or FFT techniques (if applicable) may reduce this, to as little as
O(N log N) steps in the FFT case, although the big-O constant may be moderately
large.

Thus, in accord with the present invention, a small random multiple of
h(X) can be computed as a product

$$r^{(1)}(X)\, r^{(2)}(X) \cdots r^{(t)}(X)\, h(X),$$

where each $r^{(i)}(X)$ has only a few nonzero terms. Then, the amount of
computation needed is proportional to the sum of the numbers of nonzero terms,
while the size of the sample space is approximately equal to the product of the
sizes of the sample spaces for the $r^{(i)}$.



For example, let N = 251 and take

$$r(X) = r^{(1)}(X)\, r^{(2)}(X),$$

where $r^{(1)}$ and $r^{(2)}$ are polynomials with exactly eight nonzero
coefficients, four 1's and four -1's. To avoid too much duplication, preferably
$r^{(i)}(0) = 1$, so only three of the 1's are randomly placed. Then, the number
of such r(X) polynomials is approximately

$$C(250, 3)\, C(247, 4) \cdot C(250, 3)\, C(247, 4) \cdot \tfrac{1}{2} \approx 2^{95.94}.$$

If one tries to guess $r^{(1)}(X)$ and then use a square root search for
$r^{(2)}(X)$, this leads to a search algorithm of length approximately

$$\bigl(C(250, 3)\, C(247, 4)\bigr) \cdot \sqrt{C(250, 3)\, C(247, 4)} \cdot \tfrac{1}{2} \approx 2^{71.7}.$$

The computation of the product r(X)h(X) is reduced to approximately 16N
additions and subtractions. Notice that r(X) itself has about 64 nonzero
coefficients, so a direct computation of r(X)h(X) requires almost 4 times as many
elementary operations.
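
The counts in this example can be reproduced with a short calculation. This is
a sketch under the stated placement rule (constant term fixed at 1, three more
1's among the remaining 250 positions, then four -1's among the 247 positions
left); the variable names are illustrative only.

```python
from math import comb, log2

N = 251  # ring dimension used in the example

# One factor: constant term fixed at +1, three more +1's among the other
# N - 1 = 250 positions, then four -1's among the remaining 247 positions.
per_factor = comb(N - 1, 3) * comb(N - 4, 4)

# The product r1 * r2 is symmetric in its factors, hence the factor 1/2.
log2_total = 2 * log2(per_factor) - 1

# Guess r1 outright, then run a square root search over r2.
log2_guess_then_sqrt = 1.5 * log2(per_factor) - 1

# Addition/subtraction counts: two sparse factors vs. the expanded r(X).
ops_factored = (8 + 8) * N
ops_direct = (8 * 8) * N  # assumes the 64 product terms do not collide

if __name__ == "__main__":
    print(f"sample space     ~ 2^{log2_total:.2f}")
    print(f"guess + sqrt     ~ 2^{log2_guess_then_sqrt:.2f}")
    print(f"operation ratio  = {ops_direct / ops_factored:.1f}")
```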


A similar construction can be used for the NTRU private key f(X), leading 
to a similar computational speedup for decryption. 
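
The factored computation can be sketched as follows. This is a minimal
illustration, not an implementation of NTRU: the modulus q = 128, the random
stand-in for h(X), and the helper names `sparse_poly`, `mul_sparse`, and
`expand` are all assumptions made for the example. It applies the two sparse
factors one after the other and checks the result against multiplication by the
expanded product.

```python
import random

def sparse_poly(n, plus, minus, rng):
    """Random sparse polynomial as {exponent: +1 or -1}, constant term fixed at +1."""
    positions = rng.sample(range(1, n), plus - 1 + minus)
    poly = {0: 1}
    for e in positions[:plus - 1]:
        poly[e] = 1
    for e in positions[plus - 1:]:
        poly[e] = -1
    return poly

def mul_sparse(r, h, q):
    """Cyclic convolution r(X) * h(X) mod (X^n - 1, q) with r sparse and h dense.

    Returns the product and the number of coefficient additions/subtractions.
    """
    n = len(h)
    out = [0] * n
    ops = 0
    for e, coeff in r.items():
        for i, c in enumerate(h):
            out[(i + e) % n] = (out[(i + e) % n] + coeff * c) % q
            ops += 1
    return out, ops

def expand(r1, r2, n):
    """Multiply two sparse polynomials mod X^n - 1 (integer coefficients)."""
    prod = {}
    for e1, s1 in r1.items():
        for e2, s2 in r2.items():
            e = (e1 + e2) % n
            prod[e] = prod.get(e, 0) + s1 * s2
    return {e: c for e, c in prod.items() if c != 0}

rng = random.Random(2024)
n, q = 251, 128                              # q = 128 is an assumed modulus
h = [rng.randrange(q) for _ in range(n)]     # stand-in for a public key h(X)
r1 = sparse_poly(n, 4, 4, rng)
r2 = sparse_poly(n, 4, 4, rng)

step, ops1 = mul_sparse(r2, h, q)            # r2 * h
fast, ops2 = mul_sparse(r1, step, q)         # r1 * (r2 * h)
direct, ops_direct = mul_sparse(expand(r1, r2, n), h, q)
```

Here `ops1 + ops2` is 16N, while `ops_direct` is at most 64N and equals 64N when
no terms of the expanded product collide.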

Randomness of Small Hamming Weight Products
There are many ways of measuring randomness known to those skilled
in the art. For concreteness, let $B_N(D)$ = { binary polynomials of degree at
most N - 1 with D ones }. That is, elements of $B_N(D)$ are polynomials

$$a_0 + a_1 X + a_2 X^2 + \cdots + a_{N-1} X^{N-1}$$

with $a_i \in \{0, 1\}$ and $\sum a_i = D$. As described previously, polynomials
are multiplied using the convolution rule $X^N = 1$.

Products of polynomials are subject to a natural rotation of their
coefficients by multiplying by powers of X. In other words, any product can be
rewritten as

$$a(X) * b(X) = (X^k * a(X)) * (X^{-k} * b(X)).$$

Such rotations are far from random, so preferably they are excluded from the
sample spaces. Thus, let

$$B^*_N(D) = \{\, a(X) = a_0 + a_1 X + \cdots + a_{N-1} X^{N-1} \in B_N(D) : a_0 = 1 \,\}$$

be the subset of $B_N(D)$ consisting of polynomials whose constant coefficient is
nonzero.

Compare the space of random binary polynomials $B^*_N(D)$ with the space
of products

$$P^*_N(d_1, d_2) = \{\, c(X) = a(X) * b(X) : a(X) \in B^*_N(d_1),\ b(X) \in B^*_N(d_2),\ c(X) \in B^*_N(d_1 d_2) \,\}.$$

Notice that only polynomials a(X) and b(X) whose product a(X) * b(X) is binary
are considered. In practice, this can require generating a number of pairs
(a, b) at random, multiplying them, and discarding the product if it is not of the
appropriate form.



How can one compare the set of products $P^*_N(d_1, d_2)$ with the truly
random set $B^*_N(d_1 d_2)$? In general, the former set will be much smaller than
the latter set, so not every element of $B^*_N(d_1 d_2)$ is equally likely to be
hit by an element of $P^*_N(d_1, d_2)$. Experimentally, elements of
$P^*_N(d_1, d_2)$ generally have a unique representation as a product.
Preferably, Hamming weight differences are used to determine the extent to which
elements of $P^*_N(d_1, d_2)$ are randomly distributed in the space
$B^*_N(d_1 d_2)$. For any two binary polynomials a(X) and b(X), their Hamming
weight difference can be defined to be

$$HWD(a, b) = \#\{\, i : a_i \ne b_i \,\}.$$



It is easy to compute the probability that a randomly chosen pair in
$B^*_N(D)$ will have a given Hamming weight difference. More precisely, for any
fixed $a \in B^*_N(D)$, if the known constant coefficient is ignored, there are
D - 1 ones and N - D zeros. Suppose that $b \in B^*_N(D)$ has k of its ones in
common with the ones of a. Then, HWD(a, b) gains D - 1 - k from the ones of a
that are hit by zeros of b and it gains D - 1 - k from the ones of b that hit
zeros of a, so HWD(a, b) = 2(D - 1 - k). Thus, the Hamming weight difference is
always even and it will equal 2h when exactly D - 1 - h of the ones of a and b
coincide. Dividing the number of ways that this can happen by the total number
of polynomials, for a fixed $a \in B^*_N(D)$, the probability that a randomly
chosen $b \in B^*_N(D)$ is at Hamming weight distance 2h from a is given by

$$\mathrm{Prob}\bigl(HWD(a, b) = 2h\bigr) = \frac{\binom{D-1}{h}\binom{N-D}{h}}{\binom{N-1}{D-1}}. \qquad (5)$$
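
The counting argument above places D - 1 - h of the ones of b on the ones of a
and the remaining h on the zeros of a, out of all ways of placing D - 1 ones in
N - 1 positions. A small sketch of that computation follows; the function names
are illustrative, and exact rational arithmetic is used so that the
probabilities can be checked to sum to 1.

```python
from fractions import Fraction
from math import comb

def hwd_probability(n: int, d: int, h: int) -> Fraction:
    """Probability that HWD(a, b) = 2h for fixed a and uniform b in B*_N(D).

    C(d-1, h) ways to pick which ones of a are missed, C(n-d, h) ways to put
    the displaced ones of b on zeros of a, out of C(n-1, d-1) choices of b.
    """
    return Fraction(comb(d - 1, h) * comb(n - d, h), comb(n - 1, d - 1))

def hwd_distribution(n: int, d: int) -> dict:
    """Map each possible Hamming weight difference 2h to its probability."""
    hmax = min(d - 1, n - d)
    return {2 * h: hwd_probability(n, d, h) for h in range(hmax + 1)}

dist = hwd_distribution(251, 64)
total = sum(dist.values())
mean = float(sum(k * p for k, p in dist.items()))

if __name__ == "__main__":
    print("probabilities sum to", total)
    print("mean Hamming weight difference:", round(mean, 3))
```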



It is more difficult to compute exactly the analogous probability for a randomly
chosen $b \in P^*_N(d_1, d_2)$, so a computer simulation was used. 10,000
polynomials were chosen randomly from each of the sets

$$B = B^*_{251}(64) \quad \text{and} \quad P = P^*_{251}(8, 8).$$

The distributions of Hamming weight differences HWD(a, b) were computed for all
$10^8$ pairs (a, b) chosen from each of the sets B x B, B x P, and P x P. The
results are listed in Table 1, together with the theoretical expected value from
the formula (5). It seems clear from the table that there is no discernible
difference in HWD(a, b) in the various situations studied.
Table 1. Hamming weight difference probabilities
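
A simulation of the kind described can be sketched as below. This is not the
program used to produce Table 1: the helper names are hypothetical, and the
demonstration uses the much smaller parameters $P^*_{251}(4, 4)$ and
$B^*_{251}(16)$ with 200 samples each so that it runs quickly. Products that
are not binary are discarded by rejection sampling, as described earlier.

```python
import random
from collections import Counter

def random_bstar(n, d, rng):
    """Uniform element of B*_N(D), stored as the set of exponents with coefficient 1."""
    return frozenset([0] + rng.sample(range(1, n), d - 1))

def random_pstar(n, d1, d2, rng):
    """Rejection-sample a product a * b that is binary (d1 * d2 distinct terms)."""
    while True:
        a = random_bstar(n, d1, rng)
        b = random_bstar(n, d2, rng)
        c = frozenset((i + j) % n for i in a for j in b)
        if len(c) == d1 * d2:  # no two terms collided, so c lies in B*_N(d1 * d2)
            return c

def hwd(a, b):
    """Hamming weight difference: size of the symmetric difference of the supports."""
    return len(a ^ b)

def hwd_histogram(xs, ys):
    """Empirical distribution of HWD over all pairs (x, y), including x == y."""
    counts = Counter(hwd(x, y) for x in xs for y in ys)
    total = sum(counts.values())
    return {k: v / total for k, v in sorted(counts.items())}

rng = random.Random(1)
n, d1, d2, samples = 251, 4, 4, 200   # assumed small parameters for the demo
B = [random_bstar(n, d1 * d2, rng) for _ in range(samples)]
P = [random_pstar(n, d1, d2, rng) for _ in range(samples)]
hist_bb = hwd_histogram(B, B)
hist_bp = hwd_histogram(B, P)
hist_pp = hwd_histogram(P, P)
```

The three histograms can then be compared with one another and with formula
(5) evaluated at D = d1 * d2. With so few samples the agreement is only rough.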
A General Formulation of Small Hamming Weight Products

All of the above constructions can be formulated quite generally in terms
of a ring R, an R-module M, and a subset S ⊂ R with the two properties: (i) the
set S is "sufficiently large" and (ii) the computation of products r · m for r ∈ S
and m ∈ M is "computationally easy." These properties are, to some extent,
antagonistic to one another, because presumably the larger the set S, the
harder on average it is to compute products r · m for r ∈ S.

One way to construct the set S is to choose a collection of smaller
subsets $S_1, \ldots, S_t \subset R$ and let

$$S = \{\, r_1 \cdots r_t : r_1 \in S_1, \ldots, r_t \in S_t \,\}.$$

Under suitable hypotheses, the size of the set S is approximately the product of
the sizes of $S_1, \ldots, S_t$. Each $S_i$ has the property (ii), so a product
$r_1 \cdots r_t \cdot m$ can be computed by applying the factors to m one at a
time.

Let there be one particular element τ ∈ R such that the product τ · m is
easy to compute for every m ∈ M. Then, the $S_i$ are preferably selected from
low Hamming weight polynomials in τ; that is, $S_i$ preferably consists of all
elements of R of the form

$$\tau^{j_1} + \tau^{j_2} + \cdots + \tau^{j_d}$$

for some fixed $d = d_i$ (or for some random $d < D_i$ for a fixed $D_i$). Of
course, if it is easy to compute inverses -m, then one can increase the size of
$S_i$ by using

$$\pm\tau^{j_1} \pm \tau^{j_2} \pm \tau^{j_3} \pm \cdots \pm \tau^{j_d}.$$

Similarly, if there are several easy-to-multiply elements
$\tau_1, \ldots, \tau_u \in R$, then one can take low Hamming weight polynomials
in the u "variables" $\tau_1, \ldots, \tau_u$, further increasing the size of the
special sets $S_i$.
Relating this general formulation to the examples discussed above, the
following is noted:

1. For Powers in $F_{2^n}$

The ring is R = Z, the R-module is the multiplicative group
M = $(F_{2^n})^*$, and the special map τ is the doubling (i.e., squaring) map
$\tau(a) = a^2$.

2. For Multiples on Koblitz Curves

The ring is R = End($E(F_{2^m})$) (i.e., the ring of homomorphisms from
$E(F_{2^m})$ to itself), the R-module is M = $E(F_{2^m})$, and the special map τ
is the Frobenius map $\tau(x, y) = (x^2, y^2)$.

3. For the NTRU Cryptosystem

The ring is R = $Z[X]/(X^N - 1)$, the R-module is M = R (i.e., R acts
on itself via multiplication), and the special map τ is the
multiplication-by-X map $\tau(f(X)) = X f(X)$.

This illustrates how Small Hamming Weight Products apply to these particular
situations and also illustrates the widespread applicability of the invention.
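
The general formulation can be illustrated with a small sketch in the spirit of
the first example (powers). As a stand-in for $(F_{2^n})^*$, where squaring is
cheap, the code below works in the integers modulo a prime; this substitution,
the function names, and the sample exponent positions are all assumptions made
for illustration. Each factor is a low Hamming weight sum of powers of two, and
the power is computed by applying one factor at a time while counting only the
multiplications that are not squarings.

```python
def sparse_power(base, exponents, mod):
    """Compute base^(2^j1 + ... + 2^jd) mod `mod`.

    `exponents` is a non-empty list of distinct bit positions. Repeated squaring
    plays the role of the easy map tau; the returned count covers only the
    d - 1 multiplications that combine the selected powers.
    """
    result, mults = None, 0
    power, pos = base % mod, 0
    for j in sorted(exponents):
        while pos < j:                    # apply tau (squaring) up to position j
            power = power * power % mod
            pos += 1
        if result is None:
            result = power
        else:
            result = result * power % mod
            mults += 1
    return result, mults

def product_power(base, factors, mod):
    """Compute base^(k1 * k2 * ... * kt), applying one sparse factor at a time."""
    value, total_mults = base, 0
    for exps in factors:
        value, m = sparse_power(value, exps, mod)
        total_mults += m
    return value, total_mults

p = 2**127 - 1                            # a Mersenne prime, chosen only for the demo
factors = [[0, 17, 40], [0, 9, 51, 60]]   # hypothetical sparse exponents k1, k2
k = 1
for exps in factors:
    k *= sum(1 << j for j in exps)
value, mults = product_power(3, factors, p)
```

In this demo the exponent k has up to 12 nonzero binary terms when expanded,
but only (3 - 1) + (4 - 1) = 5 non-squaring multiplications are used.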

The size of S, where the set S is the image of the map

$$S_1 \times S_2 \times \cdots \times S_t \to R, \qquad (r_1, r_2, \ldots, r_t) \mapsto r_1 r_2 \cdots r_t,$$

can be partially quantified as follows. In practice, it is usually not hard to
describe a natural set T ⊂ R with the property that S ⊂ T and with the property
that a random t-tuple $(r_1, \ldots, r_t)$ of $S_1 \times \cdots \times S_t$
appears to have an equal chance of hitting each element of T. (Note: It may be
difficult to rigorously prove that T has this property, but usually at least one
can obtain experimental evidence.) Let $N_i = |S_i|$ denote the size of the set
$S_i$ and M = |T| the size of the set T. Then, using elementary probability
theory, those skilled in the art can estimate the expected number of distinct
elements when $N_1 N_2 \cdots N_t$ elements of T are chosen randomly with
replacement.
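
One standard form of this estimate, under the idealized assumption that the
draws are uniform and independent, is that n draws from a set of size M yield
$M\,(1 - (1 - 1/M)^n)$ distinct elements on average, since each fixed element is
missed by all n draws with probability $(1 - 1/M)^n$. A minimal sketch follows;
the function name is illustrative.

```python
from math import expm1, log1p

def expected_distinct(m: int, n: int) -> float:
    """Expected number of distinct values among n uniform draws from m values.

    Uses m * (1 - (1 - 1/m)^n), evaluated with log1p/expm1 so that very large
    m and n (such as the 2^160-sized spaces discussed above) stay accurate.
    """
    if m == 1:
        return 1.0 if n > 0 else 0.0
    return -m * expm1(n * log1p(-1.0 / m))

if __name__ == "__main__":
    # When the number of tuples equals |T|, about 1 - 1/e of T is expected to be hit.
    m = 2**160
    print(expected_distinct(m, m) / m)
```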



The present invention has been described in detail including the
preferred embodiments thereof. However, it will be understood that, upon
consideration of the present specification, those skilled in the art may make
modifications and/or improvements within the spirit and scope of this
invention. The techniques of the present invention provide significantly
improved computational efficiency relative to the prior art techniques. It should
be emphasized that the techniques described above are exemplary and should
not be construed as limiting the present invention to a particular group of
illustrative embodiments.

The disclosures of all references listed herein are hereby incorporated in
their entirety by reference. Additionally, the disclosures in the publications
D. Gordon, A survey of fast exponentiation methods, Journal of Algorithms 27
(1998), 129-146, and D. Stinson, Cryptography: Theory and Practice, CRC Press,
1997, are also incorporated by reference.



